In this project, I outline a computational/bibliometric approach to classifying epistemic cultures. As science studies scholar Karin Knorr-Cetina (1999) highlights, science is not a uniform community. Instead, science is broken up into various cultures that produce different kinds of knowledge through diverse ideologies and practices. An epistemic culture refers to “those sets of practices, arrangements and mechanisms bound together by necessity, affinity and historical coincidence which, in a given area of professional expertise, make up how we know what we know” (Knorr-Cetina 2007, 363). Knorr-Cetina highlights the ways that nonhuman objects fit into the production of knowledge in these contexts, noting that the triangular relations between humans and their respective scientific objects often change over time based on social, financial, and technological factors.

Annemarie Mol (2002) likewise finds that biomedical objects can be enacted in diverse ways and, in turn, mean different things for different social actors. For example, she shows that atherosclerosis is not one object, but a multiplicity that produces distinct material-discursive effects depending on how researchers, clinicians, or patients relate to it. In their new book, Rebecca Jordan-Young and Katrina Karkazis (forthcoming) similarly show that testosterone is a multiplici-T: a hormone that is constructed in distinct, and often conflicting, ways depending on the assumptions that researchers make about it. Building on this literature in feminist science studies, I am interested in the distinct biomedical cultures that use testosterone and in how to classify scientific cultures that use biomarkers like testosterone in their research.

In my work, I examine how researchers use testosterone across scientific cultures. Below, I outline the reiterative process I followed to classify scientific cultures using Aria & Cuccurullo’s bibliometrix package in R. bibliometrix provides a set of tools for quantitative research in bibliometrics and scientometrics. Generally, my classification process follows four steps:

Step 1: An Exploratory Analysis of Testosterone Research

In the first step, I generated a sample of all the research that used testosterone over nearly four decades. To do this, I used the search term “TS=(testosterone)” in the Web of Science (WoS) Core Collection database, limiting my search to English-language articles and reviews from 1980-2016. This search yielded 58,642 results. A literature of this size makes it difficult to know where to start exploring how different types of scientists use testosterone. Thus, the next step was to identify the most prominent keywords in the overall literature, which I found using the bibliometrix package.

To use bibliometrix, first install and load the package. Next, convert the WoS files into a dataframe.

#install.packages('bibliometrix')

library(bibliometrix)
## To cite bibliometrix in publications, please use:
## 
## Aria, M. & Cuccurullo, C. (2017) bibliometrix: An R-tool for comprehensive science mapping analysis, Journal of Informetrics, 11(4), pp 959-975, Elsevier.
##                         
## 
## http:\\www.bibliometrix.org
## 
##                         
## To start with the shiny web-interface, please digit:
## biblioshiny()
setwd("C:/Users/soren/Google Drive/Biomedical MultipliciTs/1. Evidence Infrastructure/2. Domain Parsing")

tsearch_raw <- readFiles('tsearch_1-58642.txt')

tsearch <- convert2df(tsearch_raw, dbsource = "isi", format = "plaintext")

# (I've hidden the extraction process to make this file more concise.)

Then use the biblioAnalysis function to find the most prominent authors, most highly cited papers, and most commonly occurring keywords in the literature. You can either explore these results in R or write them to a .csv file, as I did.

tsearch_bib <- biblioAnalysis(tsearch, sep = ";")
tsearch_stats <- summary(object = tsearch_bib, k = 500, pause = FALSE)

# Explore results in R 

# tsearch_stats$MostRelKeywords

# Output to .csv 

write.table(as.data.frame(tsearch_stats$MostRelKeywords),file="tsearch_keywords.csv", quote=F,sep=",",row.names=F)


# (I've hidden the extraction process to make this file more concise.)
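To make the idea of "most commonly occurring keywords" concrete, here is a minimal base-R sketch of tallying keywords from a semicolon-separated field such as the WoS `DE` (author keywords) column. The toy data frame and the `count_keywords()` helper are assumptions made for illustration; this is not how bibliometrix itself is implemented, only a rough picture of the kind of table that `MostRelKeywords` reports.

```r
# Toy stand-in for a converted WoS data frame. Only the DE column
# (author keywords separated by ";") is used here.
toy <- data.frame(
  DE = c("TESTOSTERONE; HYPOGONADISM",
         "TESTOSTERONE; CARDIOVASCULAR DISEASE",
         "ANDROGENS; TESTOSTERONE",
         NA),
  stringsAsFactors = FALSE
)

# count_keywords() is a hypothetical helper, not part of bibliometrix.
count_keywords <- function(field, sep = ";") {
  # Split each record into individual keywords, skipping missing records.
  terms <- unlist(strsplit(field[!is.na(field)], sep, fixed = TRUE))
  # Normalise whitespace and case, then drop empty strings.
  terms <- toupper(trimws(terms))
  terms <- terms[nzchar(terms)]
  # Tabulate and sort from most to least frequent.
  counts <- sort(table(terms), decreasing = TRUE)
  data.frame(keyword = names(counts),
             articles = as.integer(counts),
             stringsAsFactors = FALSE)
}

top_keywords <- count_keywords(toy$DE)
print(head(top_keywords, 3))
```

On real data, the same helper could be pointed at the `DE` or `ID` column of the converted dataframe, though the counts may differ slightly from the package's own summary depending on how duplicates and spelling variants are handled.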

Step 2: Parsing the Literature into Cultural Domains

I opted to split the literature into various cultural domains based on prominent topics that surfaced in the top-500 keywords. For the manual coding process, I took multiple passes at defining the cultural domains in testosterone research. Generally, I tried to classify cultures based on keywords that were coupled to a general health and/or behavioral outcome. For the most part, this was easy. Keywords related to cardiovascular disease or endocrine disrupting chemicals, for example, each fell into their own domains. Other domains, however, were not so clear. In some cases, I found significant overlap between domains, but ultimately ended up separating them, as with prostate, breast, and testicular cancer. It should be noted that while the cultural domains are usually distinct, there is certainly overlap between some of them. For example, it is not possible to cleanly separate research on cardiovascular disease from research on metabolic diseases, even though most consider these health outcomes distinct domains. In the future, I plan to write more about the implications of this overlap for scientific exchanges.

Before moving on, I also want to highlight two theoretical commitments that shaped my results. First, following feminist science studies (Fine 2017; Jordan-Young and Karkazis forthcoming; Oudshoorn 1994; Roberts 2007), I opted not to split testosterone research domains into sex/gender-specific terms. Doing so would perpetuate a misleading and dangerous form of essentialism that my work attempts to critique. Second, following actor-network theory (Latour 2005), I opted not to split animal research into a distinct domain. While this split was tempting (as you can see in some of my preliminary coding), I ultimately found that animal research was used in almost every domain because Tsearchers use animal models to inform their epistemic models.

Now, some results: After the first stage of splitting, I ended up with 20 different domains of research, including topics like cardiovascular disease, polycystic ovary syndrome, and bone research. However, after lumping terms together, it became obvious that some of these domains were too general. For example, the “development” hub encompassed nearly 50 keywords while most other domains had only around 10 terms. On the other hand, the “cancer” hub only had 10 terms, but produced a literature nearly twice the size of any other domain. Thus, after another round of splitting, I ended with 25 domains of research including:

Aging; Bone; Breast Cancer; Cardiovascular Disease; Dermatology; Disorders of Sex Development and Sex Differences; Endocrine Disrupting Chemicals; Fertility, Infertility, Reproduction, and Sterility; Immunology; Non-Testosterone Interventions; Metabolic Diseases; Methods; Muscle; Neurology/Mental Health; Obesity; Polycystic Ovary Syndrome; Prostate Cancer; Puberty; Quantity; Sexual Medicine; Social Neuroendocrinology; Surgical Procedures; Testicular Cancer; Testosterone Therapies; and Transgender Health.
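The lumping and splitting described above can be pictured as a simple coding sheet with one row per keyword and a hand-assigned domain. The sketch below is a minimal illustration with made-up rows (not the actual coding data) showing how one might count keywords per domain and flag hubs that look too general. The 1.4x-median threshold is an arbitrary assumption chosen for this toy sheet.

```r
# Hypothetical coding sheet: one row per keyword with its hand-assigned domain.
# These rows are illustrative only, not the actual coding data.
coding <- data.frame(
  keyword = c("CARDIOVASCULAR DISEASE", "ATHEROSCLEROSIS", "HYPERTENSION",
              "POLYCYSTIC OVARY SYNDROME", "PCOS", "OSTEOPOROSIS"),
  domain  = c("Cardiovascular Disease", "Cardiovascular Disease",
              "Cardiovascular Disease", "Polycystic Ovary Syndrome",
              "Polycystic Ovary Syndrome", "Bone"),
  stringsAsFactors = FALSE
)

# Number of keywords per domain as a named integer vector, largest first.
tab <- table(coding$domain)
domain_sizes <- sort(setNames(as.integer(tab), names(tab)), decreasing = TRUE)
print(domain_sizes)

# Flag domains whose keyword count sits well above the typical domain.
# The 1.4x multiplier is an arbitrary choice for this toy example.
too_broad <- names(domain_sizes)[domain_sizes > 1.4 * median(domain_sizes)]
print(too_broad)
```

A flagged domain is only a prompt for another manual pass; whether to split it remains a substantive judgment about the literature, not something the count decides.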

Step 3: Sensitivity Analyses

To ensure that I included all the relevant search terms for each domain, I conducted a sensitivity analysis akin to the one in Shwed and Bearman’s (2010) paper on scientific consensus. To do this, I inserted all of the domain-specific keywords from Step 2 into a WoS search for each domain. I then followed the same procedure outlined above to download the WoS data, output the top-100 keywords, and add relevant topics that were missing from my original search terms. For example, here are the search terms and sensitivity analyses I conducted for the Cardiovascular Disease Domain.
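One pass of this sensitivity check can be sketched in base R: compare the top keywords returned by a domain search against the terms already in the query, then assemble an updated topic search. The term lists below are illustrative subsets, and `normalise()` and `build_query()` are hypothetical helpers. The query shape (`TS=(testosterone) AND TS=(...)`) is an assumption about how the domain searches were combined, not a record of the actual queries.

```r
# Search terms already used for a domain (illustrative subset, not the full list).
search_terms <- c("cardiovascular disease", "atherosclerosis", "hypertension")

# Top keywords returned by the follow-up search (illustrative values).
top_keywords <- c("CARDIOVASCULAR-DISEASE", "TESTOSTERONE",
                  "CORONARY-ARTERY-DISEASE", "HYPERTENSION",
                  "MYOCARDIAL-INFARCTION")

# Normalise case and hyphenation so the two lists are comparable.
normalise <- function(x) tolower(gsub("-", " ", trimws(x), fixed = TRUE))

# Keywords not yet covered by the search terms. Each candidate still needs
# a manual relevance check before it is added to the domain.
candidates <- setdiff(normalise(top_keywords),
                      c(normalise(search_terms), "testosterone"))
print(candidates)

# build_query() is a hypothetical helper that assembles a WoS topic search.
build_query <- function(terms) {
  quoted <- paste0('"', terms, '"', collapse = " OR ")
  paste0("TS=(testosterone) AND TS=(", quoted, ")")
}
cat(build_query(c(search_terms, candidates)), "\n")
```

Repeating the download, keyword tally, and comparison until no relevant candidates remain is what makes the process reiterative.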

To carry out this process in R, read in your WoS .txt files and convert them to dataframes.

setwd("C:/Users/soren/Google Drive/Biomedical MultipliciTs/1. Evidence Infrastructure/3. Reiterative Analysis")

# Importing the 25 Domain Datasets

aging <- readLines('aging_6131.txt')  
bone <- readLines('bone_4851.txt')
breast_cancer <- readLines('breast_cancer_1979.txt')
cvd <- readLines('cvd_4175.txt')
derm <- readLines('derm_3252.txt')
dsd <- readLines('dsd_4615.txt')
edcs <- readLines('edcs_3125.txt')
firs <- readLines('firs_5846.txt')
immuno <- readLines('immuno_2631.txt')
interventions <- readLines('interventions_5966.txt')
metabolic <- readLines('metabolic_4912.txt')
methods <- readLines('methods_6261.txt')
muscle <- readLines('muscle_4249.txt')
neuro <- readLines('neuro_5125.txt')
obesity <- readLines('obesity_6489.txt')
pcos <- readLines('pcos_3353.txt')
prostate_cancer <- readLines('prostate_cancer_5438.txt')
puberty <- readLines('puberty_4591.txt')
quantity <- readLines('quantity_6493.txt')
sexmed <- readLines('sexmed_2980.txt')
snet <- readLines('snet_5553.txt')
surgical <- readLines('surgical_6498.txt')
test_cancer <- readLines('testicular_cancer_1345.txt')
testo_therapies <- readLines('testo_therapies_8185.txt')
transhealth <- readLines('transhealth_464.txt')

# Convert to Dataframes

aging <- isi2df(aging)
bone <- isi2df(bone)
breast_cancer <- isi2df(breast_cancer)
cvd <- isi2df(cvd)
derm <- isi2df(derm)
dsd <- isi2df(dsd)
edcs <- isi2df(edcs)
firs <- isi2df(firs)
immuno <- isi2df(immuno)
interventions <- isi2df(interventions)
metabolic <- isi2df(metabolic)
methods <- isi2df(methods)
neuro <- isi2df(neuro)
muscle <- isi2df(muscle)
obesity <- isi2df(obesity)
pcos <- isi2df(pcos)
puberty <- isi2df(puberty)
prostate_cancer <- isi2df(prostate_cancer)
quantity <- isi2df(quantity)
sexmed <- isi2df(sexmed)
snet <- isi2df(snet)
surgical <- isi2df(surgical)
test_cancer <- isi2df(test_cancer)
testo_therapies <- isi2df(testo_therapies)
transhealth <- isi2df(transhealth)

# (I've hidden the extraction process to make this file more concise.)
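The 25 import-and-convert pairs above follow one pattern, so they could also be written as a loop over a named vector of file names. The sketch below is one way to do that. `import_domains()` is a hypothetical helper, and `convert` defaults to `identity` so the example runs without bibliometrix installed; the two temporary files merely stand in for WoS exports.

```r
# import_domains() is a hypothetical helper. `convert` defaults to identity so
# that this sketch runs without bibliometrix installed.
import_domains <- function(files, dir = ".", convert = identity) {
  # lapply() keeps the names of `files`, so the result is a named list.
  lapply(files, function(f) convert(readLines(file.path(dir, f))))
}

# Demonstration with two small temporary files standing in for WoS exports.
demo_dir <- tempfile("wos_demo")
dir.create(demo_dir)
writeLines(c("PT J", "TI Example record", "ER"),
           file.path(demo_dir, "cvd_demo.txt"))
writeLines(c("PT J", "TI Another record", "ER"),
           file.path(demo_dir, "bone_demo.txt"))

files <- c(cvd = "cvd_demo.txt", bone = "bone_demo.txt")
domains <- import_domains(files, dir = demo_dir)
str(domains)
```

With bibliometrix loaded, passing `convert = isi2df` and the real file names (for example `cvd = 'cvd_4175.txt'`) should yield a named list whose elements correspond to the individual dataframes created above, assuming a package version in which `isi2df` accepts the output of `readLines`.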

Then, conduct the bibliometrix analysis for each domain. I’ve provided a sample of the top-25 authors, articles, and keywords for the CVD domain below.

setwd("C:/Users/soren/Google Drive/Biomedical MultipliciTs/1. Evidence Infrastructure/3. Reiterative Analysis")

cvd_bib <- biblioAnalysis(cvd, sep = ";")
cvd_stats <- summary(object = cvd_bib, k = 25, pause = FALSE)
## 
## 
## Main Information about data
## 
##  Documents                             4175 
##  Sources (Journals, Books, etc.)       1085 
##  Keywords Plus (ID)                    7359 
##  Author's Keywords (DE)                4957 
##  Period                                1983 - 2016 
##  Average citations per documents       35.2 
## 
##  Authors                               15515 
##  Author Appearances                    23580 
##  Authors of single authored documents  107 
##  Authors of multi authored documents   15408 
## 
##  Documents per Author                  0.269 
##  Authors per Document                  3.72 
##  Co-Authors per Documents              5.65 
##  Collaboration Index                   3.94 
##  
##  Document types                     
##  B                                     23 
##  J                                     4123 
##  S                                     29 
##  
## 
## Annual Scientific Production
## 
##  Year    Articles
##     1983        3
##     1985        1
##     1989        3
##     1990        7
##     1991       38
##     1992       41
##     1993       51
##     1994       45
##     1995       49
##     1996       75
##     1997       69
##     1998       79
##     1999       79
##     2000       67
##     2001      108
##     2002      115
##     2003      134
##     2004      119
##     2005      152
##     2006      145
##     2007      209
##     2008      219
##     2009      227
##     2010      247
##     2011      278
##     2012      309
##     2013      300
##     2014      308
##     2015      342
##     2016      356
## 
## Annual Percentage Growth Rate 17.90401 
## 
## 
## Most Productive Authors
## 
##     Authors        Articles   Authors        Articles Fractionalized
## 1  MAGGI M               86 JONES TH                           17.81
## 2  CORONA G              71 MAGGI M                            11.97
## 3  JONES TH              60 YEAP BB                            11.08
## 4  FORTI G               48 BASARIA S                          10.87
## 5  MANNUCCI E            45 CORONA G                           10.42
## 6  CHANNER KS            44 BJORNTORP P                         9.71
## 7  BASARIA S             38 SAAD F                              9.32
## 8  BHASIN S              38 CHANNER KS                          9.25
## 9  RASTRELLI G           38 GOOREN L                            8.45
## 10 HANDELSMAN DJ         35 TRAISH AM                           7.60
## 11 YEAP BB               34 BHASIN S                            7.34
## 12 WALLASCHOFSKI H       33 SCHOOLING CM                        6.74
## 13 SAAD F                32 KHALIL RA                           6.42
## 14 HARING R              31 FORTI G                             6.30
## 15 VOLZKE H              29 MORLEY JE                           6.19
## 16 NAUCK M               28 HANDELSMAN DJ                       6.18
## 17 SFORZA A              28 RECKELHOFF JF                       6.07
## 18 DOBS AS               24 BARRETT-CONNOR E                    5.77
## 19 AVERSA A              22 SVARTBERG J                         5.57
## 20 JONES RD              22 MANNUCCI E                          5.54
## 21 SVARTBERG J           21 DAVIS SR                            5.43
## 22 BJORNTORP P           20 DOBS AS                             5.32
## 23 RECKELHOFF JF         20 NIESCHLAG E                         5.21
## 24 VIGNOZZI L            20 AVERSA A                            5.11
## 25 LENZI A               19 ZITZMANN M                          4.94
## 
## 
## Top manuscripts per citations
## 
##                                Paper            TC TCperYear
## 1  HAYES JD, 2005, ANNU REV PHARMACOL         1935     148.8
## 2  HARMAN SM, 2001, J CLIN ENDOCR METAB       1385      81.5
## 3  KNOCHENHAUER ES, 1998, J CLIN ENDOCR METAB 1073      53.6
## 4  FELDMAN HA, 2002, J CLIN ENDOCR METAB       882      55.1
## 5  GRAY A, 1991, J CLIN ENDOCR METAB           740      27.4
## 6  BJORNTORP P, 1991, DIABETES CARE            730      27.0
## 7  BASARIA S, 2010, NEW ENGL J MED             675      84.4
## 8  MENDELSOHN ME, 2005, SCIENCE                616      47.4
## 9  KAUFMAN JM, 2005, ENDOCR REV                613      47.2
## 10 HEIDENREICH A, 2014, EUR UROL               572     143.0
## 11 RECKELHOFF JF, 2001, HYPERTENSION           554      32.6
## 12 ATTARD G, 2008, J CLIN ONCOL                541      54.1
## 13 LAAKSONEN DE, 2004, DIABETES CARE           536      38.3
## 14 TCHERNOF A, 2013, PHYSIOL REV               513     102.6
## 15 APRIDONIDZE T, 2005, J CLIN ENDOCR METAB    505      38.8
## 16 BJORNTORP P, 1996, INT J OBESITY            473      21.5
## 17 LAUGHLIN GA, 2008, J CLIN ENDOCR METAB      444      44.4
## 18 KHAW KT, 2007, CIRCULATION                  440      40.0
## 19 BAUMGARTNER RN, 1999, MECH AGEING DEV       439      23.1
## 20 LIU PY, 2003, ENDOCR REV                    432      28.8
## 21 KAPOOR D, 2006, EUR J ENDOCRINOL            426      35.5
## 22 MARIN P, 1992, INT J OBESITY                422      16.2
## 23 RHODEN EL, 2004, NEW ENGL J MED             421      30.1
## 24 YAGGI HK, 2006, DIABETES CARE               401      33.4
## 25 WU FCW, 2003, ENDOCR REV                    400      26.7
## 
## 
## Most Productive Countries (of corresponding authors)
## 
##         Country   Articles    Freq SCP MCP MCP_Ratio
## 1  USA                1193 0.29147 980 213    0.1785
## 2  ITALY               314 0.07672 257  57    0.1815
## 3  UNITED KINGDOM      265 0.06474 195  70    0.2642
## 4  GERMANY             201 0.04911 137  64    0.3184
## 5  AUSTRALIA           182 0.04447 158  24    0.1319
## 6  CHINA               164 0.04007 133  31    0.1890
## 7  TURKEY              161 0.03934 154   7    0.0435
## 8  JAPAN               136 0.03323 120  16    0.1176
## 9  CANADA              108 0.02639  84  24    0.2222
## 10 SPAIN               106 0.02590  92  14    0.1321
## 11 BRAZIL              105 0.02565  94  11    0.1048
## 12 SWEDEN              103 0.02516  73  30    0.2913
## 13 NETHERLANDS          83 0.02028  68  15    0.1807
## 14 FRANCE               75 0.01832  62  13    0.1733
## 15 POLAND               70 0.01710  60  10    0.1429
## 16 GREECE               68 0.01661  58  10    0.1471
## 17 TAIWAN               67 0.01637  61   6    0.0896
## 18 FINLAND              59 0.01441  49  10    0.1695
## 19 DENMARK              52 0.01270  49   3    0.0577
## 20 KOREA                46 0.01124  41   5    0.1087
## 21 IRAN                 38 0.00928  37   1    0.0263
## 22 NORWAY               36 0.00880  29   7    0.1944
## 23 BELGIUM              33 0.00806  23  10    0.3030
## 24 MEXICO               31 0.00757  26   5    0.1613
## 25 INDIA                29 0.00709  26   3    0.1034
## 
## 
## SCP: Single Country Publications
## 
## MCP: Multiple Country Publications
## 
## 
## Total Citations per Country
## 
##       Country      Total Citations Average Article Citations
## 1  USA                       51866                     43.48
## 2  UNITED KINGDOM            19970                     75.36
## 3  ITALY                     10271                     32.71
## 4  GERMANY                    7848                     39.04
## 5  AUSTRALIA                  6894                     37.88
## 6  SWEDEN                     5972                     57.98
## 7  NETHERLANDS                4225                     50.90
## 8  CANADA                     3415                     31.62
## 9  JAPAN                      2830                     20.81
## 10 TURKEY                     2792                     17.34
## 11 FINLAND                    2654                     44.98
## 12 FRANCE                     2444                     32.59
## 13 SPAIN                      2336                     22.04
## 14 CHINA                      1968                     12.00
## 15 GREECE                     1672                     24.59
## 16 DENMARK                    1618                     31.12
## 17 NORWAY                     1543                     42.86
## 18 BELGIUM                    1449                     43.91
## 19 POLAND                     1392                     19.89
## 20 BRAZIL                     1286                     12.25
## 21 ISRAEL                      918                     38.25
## 22 TAIWAN                      868                     12.96
## 23 AUSTRIA                     587                     24.46
## 24 MEXICO                      577                     18.61
## 25 SWITZERLAND                 520                     18.57
## 
## 
## Most Relevant Sources
## 
##                                                     Sources        Articles
## 1  JOURNAL OF CLINICAL ENDOCRINOLOGY & METABOLISM                       201
## 2  CLINICAL ENDOCRINOLOGY                                               117
## 3  JOURNAL OF SEXUAL MEDICINE                                           117
## 4  EUROPEAN JOURNAL OF ENDOCRINOLOGY                                     83
## 5  JOURNAL OF ENDOCRINOLOGICAL INVESTIGATION                             58
## 6  METABOLISM-CLINICAL AND EXPERIMENTAL                                  55
## 7  ATHEROSCLEROSIS                                                       53
## 8  GYNECOLOGICAL ENDOCRINOLOGY                                           52
## 9  AGING MALE                                                            45
## 10 ENDOCRINOLOGY                                                         45
## 11 HYPERTENSION                                                          45
## 12 PLOS ONE                                                              41
## 13 FERTILITY AND STERILITY                                               40
## 14 AMERICAN JOURNAL OF PHYSIOLOGY-HEART AND CIRCULATORY PHYSIOLOGY       38
## 15 MATURITAS                                                             38
## 16 INTERNATIONAL JOURNAL OF OBESITY                                      33
## 17 HORMONE AND METABOLIC RESEARCH                                        30
## 18 INTERNATIONAL JOURNAL OF IMPOTENCE RESEARCH                           30
## 19 ASIAN JOURNAL OF ANDROLOGY                                            29
## 20 HUMAN REPRODUCTION                                                    29
## 21 EXPERIMENTAL AND CLINICAL ENDOCRINOLOGY & DIABETES                    27
## 22 JOURNAL OF ENDOCRINOLOGY                                              27
## 23 MENOPAUSE-THE JOURNAL OF THE NORTH AMERICAN MENOPAUSE SOCIETY         27
## 24 STEROIDS                                                              27
## 25 AMERICAN JOURNAL OF PHYSIOLOGY-ENDOCRINOLOGY AND METABOLISM           26
## 
## 
## Most Relevant Keywords
## 
##    Author Keywords (DE)      Articles   Keywords-Plus (ID)     Articles
## 1  TESTOSTERONE                   975 CARDIOVASCULAR-DISEASE        827
## 2  ERECTILE DYSFUNCTION           221 TESTOSTERONE                  744
## 3  ANDROGENS                      210 INSULIN-RESISTANCE            555
## 4  METABOLIC SYNDROME             196 MEN                           513
## 5  HYPOGONADISM                   187 METABOLIC SYNDROME            471
## 6  INSULIN RESISTANCE             168 POSTMENOPAUSAL WOMEN          411
## 7  POLYCYSTIC OVARY SYNDROME      163 CORONARY-ARTERY-DISEASE       374
## 8  CARDIOVASCULAR DISEASE         153 MIDDLE-AGED MEN               334
## 9  HYPERTENSION                   137 RISK-FACTORS                  327
## 10 OBESITY                        136 HORMONE-BINDING GLOBULIN      326
## 11 ESTROGEN                       116 RISK                          319
## 12 SEX HORMONES                   114 OLDER MEN                     318
## 13 ESTRADIOL                      102 WOMEN                         283
## 14 ATHEROSCLEROSIS                 99 CORONARY-HEART-DISEASE        273
## 15 PROSTATE CANCER                 93 ELDERLY-MEN                   254
## 16 AGING                           91 SEX-HORMONES                  247
## 17 HORMONES                        86 DISEASE                       243
## 18 ANDROGEN                        82 HEART-DISEASE                 233
## 19 BLOOD PRESSURE                  80 ENDOGENOUS SEX-HORMONES       232
## 20 DIABETES                        75 MYOCARDIAL-INFARCTION         230
## 21 CARDIOVASCULAR RISK             74 BLOOD-PRESSURE                218
## 22 GENDER                          71 REPLACEMENT THERAPY           203
## 23 MENOPAUSE                       65 LOW SERUM TESTOSTERONE        202
## 24 LIPIDS                          62 ASSOCIATION                   200
## 25 PCOS                            62 PREVALENCE                    200
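As a worked check on the "Annual Percentage Growth Rate" line in the output above, the reported value of about 17.9 is consistent with a compound annual growth rate computed from the first and last yearly counts over the number of listed years minus one. This is a reading of the printed numbers, not a statement about how bibliometrix is implemented. Note that the table lists 30 years because 1984 and 1986-1988 have no records.

```r
# First and last annual counts from the CVD output above.
first_count <- 3    # articles in 1983
last_count  <- 356  # articles in 2016
n_years     <- 30   # rows in the "Annual Scientific Production" table

# Compound annual growth over (n_years - 1) intervals, as a percentage.
growth <- ((last_count / first_count)^(1 / (n_years - 1)) - 1) * 100
print(round(growth, 3))
```

Using the 33 calendar-year intervals between 1983 and 2016 instead would give a lower rate of roughly 15.6 percent, so the figure is sensitive to how gap years are treated.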

Please note the change in k between the CVD example (k = 25) and my analyses for all 25 domains (k = 100).

setwd("C:/Users/soren/Google Drive/Biomedical MultipliciTs/1. Evidence Infrastructure/3. Reiterative Analysis")

aging_bib <- biblioAnalysis(aging, sep = ";")
aging_stats <- summary(object = aging_bib, k = 100, pause = FALSE)
write.table(as.data.frame(aging_stats$MostRelKeywords),file="aging_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(aging_stats$AnnualProduction),file="aging_output.csv", quote=F,sep=",",row.names=F)

bone_bib <- biblioAnalysis(bone, sep = ";")
bone_stats <- summary(object = bone_bib, k = 100, pause = FALSE)
write.table(as.data.frame(bone_stats$MostRelKeywords),file="bone_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(bone_stats$AnnualProduction),file="bone_output.csv", quote=F,sep=",",row.names=F)

breast_cancer_bib <- biblioAnalysis(breast_cancer, sep = ";")
breast_cancer_stats <- summary(object = breast_cancer_bib, k = 100, pause = FALSE)
write.table(as.data.frame(breast_cancer_stats$MostRelKeywords),file="breast_cancer_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(breast_cancer_stats$AnnualProduction),file="breast_cancer_output.csv", quote=F,sep=",",row.names=F)

cvd_bib <- biblioAnalysis(cvd, sep = ";")
cvd_stats <- summary(object = cvd_bib, k = 100, pause = FALSE)
write.table(as.data.frame(cvd_stats$MostRelKeywords),file="cvd_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(cvd_stats$AnnualProduction),file="cvd_output.csv", quote=F,sep=",",row.names=F)

derm_bib <- biblioAnalysis(derm, sep = ";")
derm_stats <- summary(object = derm_bib, k = 100, pause = FALSE)
write.table(as.data.frame(derm_stats$MostRelKeywords),file="derm_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(derm_stats$AnnualProduction),file="derm_output.csv", quote=F,sep=",",row.names=F)

dsd_bib <- biblioAnalysis(dsd, sep = ";")
dsd_stats <- summary(object = dsd_bib, k = 100, pause = FALSE)
write.table(as.data.frame(dsd_stats$MostRelKeywords),file="dsd_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(dsd_stats$AnnualProduction),file="dsd_output.csv", quote=F,sep=",",row.names=F)

edcs_bib <- biblioAnalysis(edcs, sep = ";")
edcs_stats <- summary(object = edcs_bib, k = 100, pause = FALSE)
write.table(as.data.frame(edcs_stats$MostRelKeywords),file="edcs_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(edcs_stats$AnnualProduction),file="edcs_output.csv", quote=F,sep=",",row.names=F)

firs_bib <- biblioAnalysis(firs, sep = ";")
firs_stats <- summary(object = firs_bib, k = 100, pause = FALSE)
write.table(as.data.frame(firs_stats$MostRelKeywords),file="firs_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(firs_stats$AnnualProduction),file="firs_output.csv", quote=F,sep=",",row.names=F)

immuno_bib <- biblioAnalysis(immuno, sep = ";")
immuno_stats <- summary(object = immuno_bib, k = 100, pause = FALSE)
write.table(as.data.frame(immuno_stats$MostRelKeywords),file="immuno_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(immuno_stats$AnnualProduction),file="immuno_output.csv", quote=F,sep=",",row.names=F)

interventions_bib <- biblioAnalysis(interventions, sep = ";")
interventions_stats <- summary(object = interventions_bib, k = 100, pause = FALSE)
write.table(as.data.frame(interventions_stats$MostRelKeywords),file="interventions_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(interventions_stats$AnnualProduction),file="interventions_output.csv", quote=F,sep=",",row.names=F)

metabolic_bib <- biblioAnalysis(metabolic, sep = ";")
metabolic_stats <- summary(object = metabolic_bib, k = 100, pause = FALSE)
write.table(as.data.frame(metabolic_stats$MostRelKeywords),file="metabolic_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(metabolic_stats$AnnualProduction),file="metabolic_output.csv", quote=F,sep=",",row.names=F)

methods_bib <- biblioAnalysis(methods, sep = ";")
methods_stats <- summary(object = methods_bib, k = 100, pause = FALSE)
write.table(as.data.frame(methods_stats$MostRelKeywords),file="methods_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(methods_stats$AnnualProduction),file="methods_output.csv", quote=F,sep=",",row.names=F)

muscle_bib <- biblioAnalysis(muscle, sep = ";")
muscle_stats <- summary(object = muscle_bib, k = 100, pause = FALSE)
write.table(as.data.frame(muscle_stats$MostRelKeywords),file="muscle_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(muscle_stats$AnnualProduction),file="muscle_output.csv", quote=F,sep=",",row.names=F)

neuro_bib <- biblioAnalysis(neuro, sep = ";")
neuro_stats <- summary(object = neuro_bib, k = 100, pause = FALSE)
write.table(as.data.frame(neuro_stats$MostRelKeywords),file="neuro_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(neuro_stats$AnnualProduction),file="neuro_output.csv", quote=F,sep=",",row.names=F)

obesity_bib <- biblioAnalysis(obesity, sep = ";")
obesity_stats <- summary(object = obesity_bib, k = 100, pause = FALSE)
write.table(as.data.frame(obesity_stats$MostRelKeywords),file="obesity_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(obesity_stats$AnnualProduction),file="obesity_output.csv", quote=F,sep=",",row.names=F)

pcos_bib <- biblioAnalysis(pcos, sep = ";")
pcos_stats <- summary(object = pcos_bib, k = 100, pause = FALSE)
write.table(as.data.frame(pcos_stats$MostRelKeywords),file="pcos_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(pcos_stats$AnnualProduction),file="pcos_output.csv", quote=F,sep=",",row.names=F)

prostate_cancer_bib <- biblioAnalysis(prostate_cancer, sep = ";")
prostate_cancer_stats <- summary(object = prostate_cancer_bib, k = 100, pause = FALSE)
write.table(as.data.frame(prostate_cancer_stats$MostRelKeywords),file="prostate_cancer_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(prostate_cancer_stats$AnnualProduction),file="prostate_cancer_output.csv", quote=F,sep=",",row.names=F)

puberty_bib <- biblioAnalysis(puberty, sep = ";")
puberty_stats <- summary(object = puberty_bib, k = 100, pause = FALSE)
write.table(as.data.frame(puberty_stats$MostRelKeywords),file="puberty_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(puberty_stats$AnnualProduction),file="puberty_output.csv", quote=F,sep=",",row.names=F)
#plot(x = puberty_bib, k = 10, pause = FALSE)
#options(max.print=1000000)

quantity_bib <- biblioAnalysis(quantity, sep = ";")
quantity_stats <- summary(object = quantity_bib, k = 100, pause = FALSE)
write.table(as.data.frame(quantity_stats$MostRelKeywords),file="quantity_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(quantity_stats$AnnualProduction),file="quantity_output.csv", quote=F,sep=",",row.names=F)

sexmed_bib <- biblioAnalysis(sexmed, sep = ";")
sexmed_stats <- summary(object = sexmed_bib, k = 100, pause = FALSE)
write.table(as.data.frame(sexmed_stats$MostRelKeywords),file="sexmed_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(sexmed_stats$AnnualProduction),file="sexmed_output.csv", quote=F,sep=",",row.names=F)

snet_bib <- biblioAnalysis(snet, sep = ";")
snet_stats <- summary(object = snet_bib, k = 100, pause = FALSE)
write.table(as.data.frame(snet_stats$MostRelKeywords),file="snet_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(snet_stats$AnnualProduction),file="snet_output.csv", quote=F,sep=",",row.names=F)

surgical_bib <- biblioAnalysis(surgical, sep = ";")
surgical_stats <- summary(object = surgical_bib, k = 100, pause = FALSE)
write.table(as.data.frame(surgical_stats$MostRelKeywords),file="surgical_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(surgical_stats$AnnualProduction),file="surgical_output.csv", quote=F,sep=",",row.names=F)

test_cancer_bib <- biblioAnalysis(test_cancer, sep = ";")
test_cancer_stats <- summary(object = test_cancer_bib, k = 100, pause = FALSE)
write.table(as.data.frame(test_cancer_stats$MostRelKeywords),file="testicular_cancer_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(test_cancer_stats$AnnualProduction),file="testicular_cancer_output.csv", quote=F,sep=",",row.names=F)

testo_therapies_bib <- biblioAnalysis(testo_therapies, sep = ";")
testo_therapies_stats <- summary(object = testo_therapies_bib, k = 100, pause = FALSE)
write.table(as.data.frame(testo_therapies_stats$MostRelKeywords),file="testo_therapies_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(testo_therapies_stats$AnnualProduction),file="testo_therapies_output.csv", quote=F,sep=",",row.names=F)

transhealth_bib <- biblioAnalysis(transhealth, sep = ";")
transhealth_stats <- summary(object = transhealth_bib, k = 100, pause = FALSE)
write.table(as.data.frame(transhealth_stats$MostRelKeywords),file="transhealth_keywords.csv", quote=F,sep=",",row.names=F)
write.table(as.data.frame(transhealth_stats$AnnualProduction),file="transhealth_output.csv", quote=F,sep=",",row.names=F)

# (I've hidden the extraction process to make this file more concise.)

After generating the output for each domain, I added all of the relevant keywords to my sensitivity analyses to ensure that the search terms were capturing the publications that belong in each domain. Here is a summary of the CVD coding process as an example.

In my summary page of all 25 domains, I have provided basic descriptives of the size of each domain after Step 2 (the “Original Total” column) and Step 3 (the “Final Totals” column). Overall, the average size of the 25 domains is about 5,000 articles (M = 4965.8, SD = 1840.3). Studies that examine testosterone therapies (n = 8,185), sex development and sex differences (n = 7,112), and aging (n = 6,498) make up the largest domains, while the trans health literature is by far the smallest at only 464 articles. For those interested, I have also included a summary of how (1) excluding animal research and (2) including all WoS databases affect the size of each domain.
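As a minimal sketch of how these descriptives could be computed, the snippet below assumes the final domain sizes are stored in a named vector. The name `domain_sizes` and its labels are hypothetical, and only the four sizes quoted above are filled in, so the mean and SD it prints will not match the 25-domain figures reported in the text.

```r
# Hypothetical named vector of final domain sizes. Only these four values
# appear in the text; the remaining 21 domains would be added the same way.
domain_sizes <- c(
  testo_therapies = 8185,
  dsd             = 7112,
  aging           = 6498,
  trans_health    = 464
)

# Mean and standard deviation, rounded to one decimal place.
size_summary <- c(
  M  = round(mean(domain_sizes), 1),
  SD = round(sd(domain_sizes), 1)
)

# Names of the largest and smallest domains.
largest  <- names(which.max(domain_sizes))
smallest <- names(which.min(domain_sizes))

print(size_summary)
cat("Largest:", largest, "| Smallest:", smallest, "\n")
```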

Step 4: Using bibliometrix to Identify Influence in Each Domain

Now that we have our 25 domains set up, we will apply the same code we just ran to generate information about the most prominent authors, most cited publications, and most commonly occurring keywords in each of the 25 domains. Here are links to the bibliometrix analyses of each of the 25 domains:

Aging | Bone | Breast Cancer | CV Disease | Dermatology | Disorders of Sex Development | Endocrine Disrupting Chemicals | Fertility | Immunology | Interventions | Metabolic Diseases | Methods | Muscle | Neuro/Mental Health | Obesity | Polycystic Ovary Syndrome | Prostate Cancer | Puberty | Quantity | Sexual Medicine | Social Neuroendocrinology | Surgical | Testicular Cancer | Testosterone Therapies | Trans Health
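Because the same three lines are repeated for every domain, the export step could be wrapped in a small helper. This is only a sketch: `export_domain_tables` and its arguments are hypothetical names, and it assumes the summary object exposes `MostRelKeywords` and `AnnualProduction` as it does in the code above. The bibliometrix calls are left as commented usage so that the helper itself runs without the package or the data.

```r
# Write the keyword and annual-production tables from a summary list to
# "<prefix>_keywords.csv" and "<prefix>_output.csv" inside out_dir.
export_domain_tables <- function(stats, prefix, out_dir = ".") {
  keyword_file <- file.path(out_dir, paste0(prefix, "_keywords.csv"))
  output_file  <- file.path(out_dir, paste0(prefix, "_output.csv"))
  write.table(as.data.frame(stats$MostRelKeywords), file = keyword_file,
              quote = FALSE, sep = ",", row.names = FALSE)
  write.table(as.data.frame(stats$AnnualProduction), file = output_file,
              quote = FALSE, sep = ",", row.names = FALSE)
  # Return the two paths invisibly so callers can check what was written.
  invisible(c(keyword_file, output_file))
}

# Assumed usage, with bibliometrix loaded and the domain data frames in memory:
# domains <- list(snet = snet, surgical = surgical, transhealth = transhealth)
# for (name in names(domains)) {
#   bib   <- biblioAnalysis(domains[[name]], sep = ";")
#   stats <- summary(object = bib, k = 100, pause = FALSE)
#   export_domain_tables(stats, prefix = name)
# }
```

The list names double as file prefixes here, so a domain whose files are named differently from its object (for example, `test_cancer` written to `testicular_cancer_keywords.csv`) would need its list entry named accordingly.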

Results: Mapping Change in Tsearch Over Time (1990-2016)

To finish this exercise, let’s take a look at how each domain changes in size over time. There isn’t a lot of variability during the 1980s, so let’s use ggplot2 and plotly to graph each domain from 1990-2016.

setwd("C:/Users/soren/Google Drive/Biomedical MultipliciTs/1. Evidence Infrastructure/4. Domain Analysis")

#install.packages('ggplot2')
#install.packages('plotly')

library(ggplot2)
library(plotly)

tsearch_growth <- read.csv("tsearch_growth.csv", stringsAsFactors = FALSE)

tsearch_growth_graph <- ggplot(tsearch_growth, aes(x=year)) + 
  geom_smooth(aes(y = aging, colour = "Aging"), se = FALSE) + 
  geom_smooth(aes(y = bone, colour = "Bone"), se = FALSE) +
  geom_smooth(aes(y = breast_cancer, colour = "Breast Cancer"), se = FALSE) +
  geom_smooth(aes(y = cvd, colour = "CV Disease"), se = FALSE) +
  geom_smooth(aes(y = derm, colour = "Dermatology"),  se = FALSE) +
  geom_smooth(aes(y = dsd, colour = "Dis. Sex Dev."), se = FALSE) +
  geom_smooth(aes(y = edcs, colour = "EDC's"), se = FALSE) +
  geom_smooth(aes(y = fertility, colour = "Fertility"), se = FALSE) +
  geom_smooth(aes(y = immunology, colour = "Immunology"), se = FALSE)+
  geom_smooth(aes(y = interventions, colour = "Interventions"), se = FALSE) +
  geom_smooth(aes(y = metabolic_disease, colour = "Met. Disease"), se = FALSE) +
  geom_smooth(aes(y = methods, colour = "Methods"), se = FALSE) +
  geom_smooth(aes(y = muscle, colour = "Muscle"), se = FALSE) +
  geom_smooth(aes(y = neuro, colour = "Neuro"), se = FALSE) +
  geom_smooth(aes(y = obesity, colour = "Obesity"), se = FALSE) +
  geom_smooth(aes(y = pcos, colour = "PCOS"), method = lm, se = FALSE) +
  geom_smooth(aes(y = prostate_cancer, colour = "Prostate Cancer"), se = FALSE) +
  geom_smooth(aes(y = puberty, colour = "Puberty"), se = FALSE) +
  geom_smooth(aes(y = quantity, colour = "Quantity"), se = FALSE) +
  geom_smooth(aes(y = sexual_medicine, colour = "Sexual Medicine"), se = FALSE) +
  geom_smooth(aes(y = social_neuro, colour = "Soc. Neuroendo."), se = FALSE) +
  geom_smooth(aes(y = surgical, colour = "Surgical"), se = FALSE) +
  geom_smooth(aes(y = testicular_cancer, colour = "Testic. Cancer"), se = FALSE) +
  geom_smooth(aes(y = testo_therapies, colour = "Testo Therapies"), se = FALSE) +
  geom_smooth(aes(y = trans_health, colour = "Trans Health"), se = FALSE) +
  ggtitle("Growth of Testosterone Research from 1990-2016") + labs(x="Year", y="Publication Total") +
  theme(legend.title=element_blank()) + theme(legend.text=element_text(size = 8)) +
  theme(legend.key=element_rect(fill='white')) + theme(panel.background=element_rect(fill = 'grey93')) +
  scale_x_continuous(limits=c(1990, 2016), breaks=seq(1990, 2016, 5)) + scale_y_continuous(breaks=seq(0, 500, 50)) 
  
colors <- tsearch_growth_graph + scale_color_manual(values=c(
                              "#330000", "#660000", "#990000", "#CC3300", "#993300", 
                              "#FF9900", "#FFCC00", "#66CC33", "#339900", "#336600", 
                              "#003333", "#0066CC", "#3366CC", "#3399CC", "#6633CC",  
                              "#660033", "#990066", "#FF99FF", "#9966FF", "#99CCFF", 
                              "#6699CC", "#33CC66", "#CCFF33", "#CC9900", "#CC6666"))

(gg <- ggplotly(colors))
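The 25 geom_smooth() layers above could also be expressed as a single layer by first reshaping the data from wide to long format. The sketch below uses a made-up stand-in for tsearch_growth.csv (the `toy_growth` values are invented for illustration) and base R for the reshaping; the plotting step only runs if ggplot2 is installed.

```r
# Toy stand-in for tsearch_growth.csv: one row per year, one column per domain.
# The counts are made up for illustration only.
toy_growth <- data.frame(
  year         = 1990:1995,
  aging        = c(10, 12, 15, 19, 24, 30),
  bone         = c(8, 9, 11, 12, 15, 17),
  trans_health = c(1, 1, 2, 2, 3, 3)
)

# Stack the domain columns into long format: one row per year-domain pair.
domain_cols <- setdiff(names(toy_growth), "year")
growth_long <- data.frame(
  year         = rep(toy_growth$year, times = length(domain_cols)),
  domain       = rep(domain_cols, each = nrow(toy_growth)),
  publications = unlist(toy_growth[domain_cols], use.names = FALSE)
)

# One smoothed line per domain, coloured by the domain column.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  long_graph <- ggplot(growth_long,
                       aes(x = year, y = publications, colour = domain)) +
    geom_smooth(se = FALSE) +
    labs(x = "Year", y = "Publication Total") +
    theme(legend.title = element_blank())
}
```

One trade-off of this design is that a single layer applies the same smoothing method to every domain, whereas the original code fits PCOS with method = lm. The legend would also show the column names rather than the hand-written labels unless the domain column is recoded first.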

Overall, we see marked variability in the size of these domains over time. There is noticeable growth across 20 of these domains from 1990-2016. On the other hand, three of these domains (i.e. dermatology, testicular cancer, and trans health) show much more marginal growth. Lastly, the domains of breast cancer and surgical studies show growth only in the early portion of this window before declining after 2010 and 1996, respectively.
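One rough way to check patterns like these against the underlying counts is to find the year in which each domain peaks. The sketch below again uses invented data (`toy_growth`, `steady`, and `peaked` are hypothetical names and values), and it works on raw yearly counts, so its results may differ from what the smoothed curves in the graph suggest.

```r
# Toy yearly counts for two hypothetical domains (values are made up).
toy_growth <- data.frame(
  year   = 1990:1996,
  steady = c(5, 8, 12, 15, 20, 26, 31),
  peaked = c(10, 14, 18, 16, 12, 9, 7)
)

domain_cols <- setdiff(names(toy_growth), "year")

# Year with the highest publication count in each domain.
peak_year <- sapply(toy_growth[domain_cols], function(counts) {
  toy_growth$year[which.max(counts)]
})

# A domain that peaks before the final year has declined since its peak.
declines_after_peak <- peak_year < max(toy_growth$year)

print(peak_year)
print(declines_after_peak)
```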

References

Aria, M. & Cuccurullo, C. (2017). “bibliometrix: An R-tool for comprehensive science mapping analysis.” Journal of Informetrics, 11(4), 959-975.

Fine, C. (2017). Testosterone Rex: Myths of Sex, Science, and Society. WW Norton & Company.

Jordan-Young, R. & Karkazis, K. (forthcoming). Testosterone: The Unauthorized Biography. Harvard University Press.

Knorr-Cetina, K. (1999). Epistemic Cultures: How the Sciences Make Knowledge. Harvard University Press.

Knorr-Cetina, K. (2007). “Culture in Global Knowledge Societies: Knowledge Cultures and Epistemic Cultures.” Interdisciplinary Science Reviews, 32(4), 361-375.

Mol, A. (2002). The Body Multiple: Ontology in Medical Practice. Duke University Press.

Oudshoorn, N. (2003). Beyond the Natural Body: An Archaeology of Sex Hormones. Routledge.